Statistical topological data analysis using persistence landscapes
Abstract
We define a new topological summary for data that we call the persistence landscape. In contrast to the standard topological summaries, the barcode and the persistence diagram, it is easy to combine with statistical analysis, and its associated computations are much faster. This summary obeys a Strong Law of Large Numbers and a Central Limit Theorem. Under certain finiteness conditions, this allows us to calculate approximate confidence intervals for the expected total squared persistence. With these results one can use t-tests for statistical inference in topological data analysis. We apply these methods to numerous examples including random geometric complexes, random clique complexes, and Gaussian random fields. We also show that this summary is stable and gives lower bounds for the bottleneck distance and the Wasserstein distance.
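The abstract does not spell out the construction, so the following is only a minimal sketch. It assumes the usual definition, in which each birth-death pair (b, d) of a persistence diagram contributes a "tent" function max(0, min(t - b, d - t)) and the k-th landscape function at t is the k-th largest tent value. The function names and the toy diagram are hypothetical, chosen for illustration. Because landscapes are ordinary real-valued functions, they can be averaged pointwise, which is what makes laws of large numbers and t-tests applicable.

```python
def landscape(diagram, k, t):
    """Value at t of the k-th landscape function (k = 1, 2, ...).

    `diagram` is a list of (birth, death) pairs. Assumed definition:
    the k-th largest value of max(0, min(t - b, d - t)) over all pairs.
    """
    tents = sorted(
        (max(0.0, min(t - b, d - t)) for b, d in diagram),
        reverse=True,
    )
    # Beyond the number of pairs, the landscape is identically zero.
    return tents[k - 1] if k <= len(tents) else 0.0


def mean_landscape(diagrams, k, grid):
    """Pointwise average of the k-th landscape over several diagrams,
    sampled at the points of `grid`."""
    n = len(diagrams)
    return [sum(landscape(dgm, k, t) for dgm in diagrams) / n for t in grid]


# Hypothetical toy diagram with two nested intervals.
toy_diagram = [(0.0, 4.0), (1.0, 3.0)]
first_at_2 = landscape(toy_diagram, 1, 2.0)   # tallest tent at t = 2
second_at_2 = landscape(toy_diagram, 2, 2.0)  # second-tallest tent at t = 2
```

Sampling the mean landscape on a grid turns each sample of diagrams into a vector of real numbers, so standard summaries (means, variances, test statistics) can then be computed coordinate by coordinate.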
Similar resources
A persistence landscapes toolbox for topological statistics
Topological data analysis provides a multiscale description of the geometry and topology of quantitative data. The persistence landscape is a topological summary that can be easily combined with tools from statistics and machine learning. We give efficient algorithms for calculating persistence landscapes, their averages, and distances between such averages. We discuss an implementation of thes...
On the Bootstrap for Persistence Diagrams and Landscapes
Persistent homology probes topological properties from point clouds and functions. By looking at multiple scales simultaneously, one can record the births and deaths of topological features as the scale varies. In this paper we use a statistical technique, the empirical bootstrap, to separate topological signal from topological noise. In particular, we derive confidence sets for persistence dia...
Statistical Topology Using the Nonparametric Density Estimation and Bootstrap Algorithm
This paper presents approximate confidence intervals for each function of parameters in a Banach space based on a bootstrap algorithm. We apply a kernel density approach to estimate the persistence landscape. In addition, we evaluate the quality distribution function estimator of random variables using integrated mean square error (IMSE). The results of simulation studies show a significant impro...
Topological Data Analysis of Single-Trial Electroencephalographic Signals
Epilepsy is a neurological disorder that can negatively affect the visual, audial and motor functions of the human brain. Statistical analysis of neurophysiological recordings, such as electroencephalogram (EEG), facilitates the understanding and diagnosis of epileptic seizures. Standard statistical methods, however, do not account for topological features embedded in EEG signals. In the curren...
Conformational ensembles and sampled energy landscapes: Analysis and comparison
We present novel algorithms and software addressing four core problems in computational structural biology, namely analyzing a conformational ensemble, comparing two conformational ensembles, analyzing a sampled energy landscape, and comparing two sampled energy landscapes. Using recent developments in computational topology, graph theory, and combinatorial optimization, we make two notable con...
Journal: Journal of Machine Learning Research
Volume: 16
Publication date: 2015